Quảng Trị Province
OverThink: Slowdown Attacks on Reasoning LLMs
Kumar, Abhinav, Roh, Jaechul, Naseh, Ali, Karpinska, Marzena, Iyyer, Mohit, Houmansadr, Amir, Bagdasarian, Eugene
We increase overhead for applications that rely on reasoning LLMs-we force models to spend an amplified number of reasoning tokens, i.e., "overthink", to respond to the user query while providing contextually correct answers. The adversary performs an OVERTHINK attack by injecting decoy reasoning problems into the public content that is used by the reasoning LLM (e.g., for RAG applications) during inference time. Due to the nature of our decoy problems (e.g., a Markov Decision Process), modified texts do not violate safety guardrails. We evaluated our attack across closed-(OpenAI o1, o1-mini, o3-mini) and open-(DeepSeek R1) weights reasoning models on the FreshQA and SQuAD datasets. Our results show up to 18x slowdown on FreshQA dataset and 46x slowdown on SQuAD dataset. The attack also shows high transferability across models. To protect applications, we discuss and implement defenses leveraging LLM-based and system design approaches. Finally, we discuss societal, financial, and energy impacts of OVERTHINK attack which could amplify the costs for third-party applications operating reasoning models.
COVE: COntext and VEracity prediction for out-of-context images
Tonglet, Jonathan, Thiem, Gabriel, Gurevych, Iryna
Images taken out of their context are the most prevalent form of multimodal misinformation. Debunking them requires (1) providing the true context of the image and (2) checking the veracity of the image's caption. However, existing automated fact-checking methods fail to tackle both objectives explicitly. In this work, we introduce COVE, a new method that predicts first the true COntext of the image and then uses it to predict the VEracity of the caption. COVE beats the SOTA context prediction model on all context items, often by more than five percentage points. It is competitive with the best veracity prediction models on synthetic data and outperforms them on real-world data, showing that it is beneficial to combine the two tasks sequentially. Finally, we conduct a human study that reveals that the predicted context is a reusable and interpretable artifact to verify new out-of-context captions for the same image. Our code and data are made available.
Multi-Dialect Vietnamese: Task, Dataset, Baseline Models and Challenges
Van Dinh, Nguyen, Dang, Thanh Chi, Nguyen, Luan Thanh, Van Nguyen, Kiet
Vietnamese, a low-resource language, is typically categorized into three primary dialect groups that belong to Northern, Central, and Southern Vietnam. However, each province within these regions exhibits its own distinct pronunciation variations. Despite the existence of various speech recognition datasets, none of them has provided a fine-grained classification of the 63 dialects specific to individual provinces of Vietnam. To address this gap, we introduce Vietnamese Multi-Dialect (ViMD) dataset, a novel comprehensive dataset capturing the rich diversity of 63 provincial dialects spoken across Vietnam. Our dataset comprises 102.56 hours of audio, consisting of approximately 19,000 utterances, and the associated transcripts contain over 1.2 million words. To provide benchmarks and simultaneously demonstrate the challenges of our dataset, we fine-tune state-of-the-art pre-trained models for two downstream tasks: (1) Dialect identification and (2) Speech recognition. The empirical results suggest two implications including the influence of geographical factors on dialects, and the constraints of current approaches in speech recognition tasks involving multi-dialect speech data. Our dataset is available for research purposes.
Prototype-Based Interpretability for Legal Citation Prediction
Luo, Chu Fei, Bhambhoria, Rohan, Dahan, Samuel, Zhu, Xiaodan
Deep learning has made significant progress in the past decade, and demonstrates potential to solve problems with extensive social impact. In high-stakes decision making areas such as law, experts often require interpretability for automatic systems to be utilized in practical settings. In this work, we attempt to address these requirements applied to the important problem of legal citation prediction (LCP). We design the task with parallels to the thought-process of lawyers, i.e., with reference to both precedents and legislative provisions. After initial experimental results, we refine the target citation predictions with the feedback of legal experts. Additionally, we introduce a prototype architecture to add interpretability, achieving strong performance while adhering to decision parameters used by lawyers. Our study builds on and leverages the state-of-the-art language processing models for law, while addressing vital considerations for high-stakes tasks with practical societal impact.
Improved Sensor-Based Animal Behavior Classification Performance through Conditional Generative Adversarial Network
Zhao, Zhuqing, Ha, Dong, Damle, Abhishek, Dos, Barbara Roqueto, White, Robin, Ha, Sook
Many activity classifications segments data into fixed window size for feature extraction and classification. However, animal behaviors have various durations that do not match the predetermined window size. The dense labeling and dense prediction methods address this limitation by predicting labels for every point. Thus, by tracing the starting and ending points, we could know the time location and duration of all occurring activities. Still, the dense prediction could be noisy with misalignments problems. We modified the U-Net and Conditional Generative Adversarial Network (cGAN) with customized loss functions as a training strategy to reduce fragmentation and other misalignments. In cGAN, the discriminator and generator trained against each other like an adversarial competition. The generator produces dense predictions. The discriminator works as a high-level consistency check, in our case, pushing the generator to predict activities with reasonable duration. The model trained with cGAN shows better or comparable performance in the cow, pig, and UCI HAPT dataset. The cGAN-trained modified U-Net improved from 92.17% to 94.66% for the UCI HAPT dataset and from 90.85% to 93.18% for pig data compared to previous dense prediction work.
Sponge Examples: Energy-Latency Attacks on Neural Networks
Shumailov, Ilia, Zhao, Yiren, Bates, Daniel, Papernot, Nicolas, Mullins, Robert, Anderson, Ross
The high energy costs of neural network training and inference led to the use of acceleration hardware such as GPUs and TPUs. While this enabled us to train large-scale neural networks in datacenters and deploy them on edge devices, the focus so far is on average-case performance. In this work, we introduce a novel threat vector against neural networks whose energy consumption or decision latency are critical. We show how adversaries can exploit carefully crafted $\boldsymbol{sponge}~\boldsymbol{examples}$, which are inputs designed to maximise energy consumption and latency. We mount two variants of this attack on established vision and language models, increasing energy consumption by a factor of 10 to 200. Our attacks can also be used to delay decisions where a network has critical real-time performance, such as in perception for autonomous vehicles. We demonstrate the portability of our malicious inputs across CPUs and a variety of hardware accelerator chips including GPUs, and an ASIC simulator. We conclude by proposing a defense strategy which mitigates our attack by shifting the analysis of energy consumption in hardware from an average-case to a worst-case perspective.